Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial support for ARM64 Windows #8142

Closed
wants to merge 30 commits into from

Conversation

cocoa-xu
Copy link
Contributor

Hi, this PR includes initial patches and instructions to bring Erlang/OTP to ARM64 Windows. These changes should only affect the behaviour when running in WSL and user executes one of the following command to set up the build environment

  • ./otp_build env_win32 arm64, set up build environment for compiling ARM64 Erlang/OTP natively,
  • ./otp_build env_win32 x64_arm64, set up build environment for cross-compiling ARM64 Erlang/OTP on an x86_64 machine

Although I understand that ARM64 Windows might not be anywhere close to the main tier of support yet, this PR can serve as an initial guide for people in the community who're also interested in running Erlang/OTP natively on ARM64 Windows.

@CLAassistant
Copy link

CLAassistant commented Feb 17, 2024

CLA assistant check
All committers have signed the CLA.

Copy link
Contributor

github-actions bot commented Feb 17, 2024

CT Test Results

   10 files    248 suites   3h 28m 48s ⏱️
3 029 tests 2 890 ✅ 137 💤 2 ❌
4 468 runs  4 082 ✅ 384 💤 2 ❌

For more details on these failures, see this check.

Results for commit 88a1244.

♻️ This comment has been updated with latest results.

To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass.

See the TESTING and DEVELOPMENT HowTo guides for details about how to run test locally.

Artifacts

// Erlang/OTP Github Action Bot

@cocoa-xu cocoa-xu force-pushed the cx/windows-arm64 branch 4 times, most recently from 1b08767 to 73bc87e Compare February 18, 2024 13:59
@cocoa-xu
Copy link
Contributor Author

cocoa-xu commented Feb 19, 2024

Hi, I have also managed to craft a GitHub workflow that can build it natively on an ARM64 Windows machine, and here is the workflow log, the corresponding CI yaml file and the final installer! This build was based on 6c6b409.

There're still some issues when cross-compiling on x86_64 machine, which I'll look into later.

Other than that, all libraries except wx and odbc build without any issue, ARM64 JIT also seem to work. Although I haven't got time to do any thorough tests yet.

And one more thing is to adjust nsis scripts where relevant, such as defaulting to Program Files (Arm) and adding the vcredist_arm64.exe file.

Copy link
Contributor

@jhogberg jhogberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! I've got a few questions in the comments below :)

erts/emulator/sys/win32/erl_win_sys.h Outdated Show resolved Hide resolved
erts/etc/win32/wsl_tools/vc/cc.sh Outdated Show resolved Hide resolved
lib/common_test/configure Show resolved Hide resolved
erts/etc/win32/wsl_tools/vc/emu_cc.sh Outdated Show resolved Hide resolved
erts/etc/win32/wsl_tools/reg_query.sh Outdated Show resolved Hide resolved
@jhogberg jhogberg self-assigned this Feb 19, 2024
@jhogberg jhogberg added the team:VM Assigned to OTP team VM label Feb 19, 2024
@jhogberg jhogberg added the team:PS Assigned to OTP team PS label Feb 19, 2024
@cocoa-xu cocoa-xu force-pushed the cx/windows-arm64 branch 7 times, most recently from b590118 to becdd6d Compare February 19, 2024 13:20
erts/etc/win32/wsl_tools/vc/cc.sh Show resolved Hide resolved
make/autoconf/otp.m4 Outdated Show resolved Hide resolved
otp_build Outdated Show resolved Hide resolved
updated installation guide build with native-ethr-impls enable

updated installation guide
arm64 jit now works on windows
fix cross-compile on x86_64 for aarch64
@cocoa-xu
Copy link
Contributor Author

cocoa-xu commented Feb 21, 2024

And I was wondering if and which ones of these failed test cases should we address in this PR? Perhaps the ones in the emulator, kernel and system?

@jhogberg
Copy link
Contributor

And I was wondering if and which ones of these failed test cases should we address in this PR? Perhaps the ones in the emulator, kernel and system?

The hard crashes must be fixed before we can include this in the nightly test runs, and the rest need to be investigated and fixed before we merge it. :-)

@cocoa-xu
Copy link
Contributor Author

I found that a few of these hard-crashed ones were related to erlang:display_string/{1,2} errored with bad argument. However, the call was unsuccessful because the fd passed to WriteFile is an invalid handle according to GetLastError.

#ifdef __WIN32__
    printf("[debug] we're in display_string_2 win32\r\n");
    HANDLE fd;
    if (ERTS_IS_ATOM_STR("stdout", BIF_ARG_1)) {
        fd = GetStdHandle(STD_OUTPUT_HANDLE);
        printf("[debug] and the BIF_ARG_1 is stdout, fd=%d\r\n", fd);
    } else if (ERTS_IS_ATOM_STR("stderr", BIF_ARG_1)) {
        fd = GetStdHandle(STD_ERROR_HANDLE);
        printf("[debug] and the BIF_ARG_1 is stderr, fd=%d\r\n", fd);
        if (fd == INVALID_HANDLE_VALUE) {
            printf("[debug] and fd == INVALID_HANDLE_VALUE\r\n");
        }
    }
#else
// ...
#endif

#ifdef __WIN32__
        Uint32 w;
        if (!WriteFile(fd, str, len, &w, NULL)) {
            DWORD last_error = GetLastError();
            LPTSTR error_text = NULL;
            FormatMessage(FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_ALLOCATE_BUFFER | FORMAT_MESSAGE_IGNORE_INSERTS, NULL, HRESULT_FROM_WIN32(last_error), MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT), (LPTSTR)&error_text, 0, NULL);
            if (error_text != NULL) {
                printf("[debug] error msg: %s\r\n", error_text);
                LocalFree(error_text);
            } else {
                printf("[debug] error msg: cannot get formatted error message\r\n");
            }
            goto error;
        }
        written = (Sint)w;
#else
// ...
#endif

The coresponding test case was changed to:

display_string(Config) when is_list(Config) ->
    # using `erlang:display` for comparison and check that we can display to console
    true = erlang:display("hej"),
    true = erlang:display_string("hej"),
    ok.

And I got this output when running this test case:

$ /mnt/c/src/release/aarch64-windows-msvc/bin/erl.exe -noinput -eval "ts:run(emulator, bif_SUITE, display_string, [batch]),erlang:halt()"
Command: erl -sname test_server -rsh ssh -env PATH "c:\src\release\tests\emulator_test;c:\src\release\AARCH6~1\ERTS-1~1.2\bin;c:\Program Files (Arm)\MICROS~1\JDK-16~1.7-H\bin;c:\PROGRA~1\PARALL~1\PARALL~1\APPLIC~1;c:\Windows\System32;c:\Windows;c:\Windows\System32\wbem;c:\Windows\System32\WindowsPowerShell\v1.0;c:\Windows\System32\OpenSSH;c:\PROGRA~1\Git\cmd;c:\PROGRA~3\CHOCOL~1\bin;c:\Users\cocoa\AppData\Local\Programs\Python\PYTHON~1\Scripts;c:\Users\cocoa\AppData\Local\Programs\Python\PYTHON~1;c:\Users\cocoa\AppData\Local\MICROS~1\WINDOW~1;c:\Users\cocoa\AppData\Local\Programs\MICROS~1\bin;c:\PERL_A~1\bin;c:\Users\cocoa\AppData\Local\GITHUB~1\bin;c:\ERLANG~2\bin;c:\Users\cocoa\ELIXIR~1.1\bin;c:\PROGRA~1\MICROS~1\2022\COMMUN~1\VC\Tools\MSVC\1439~1.335\bin\HOSTAR~1\arm64;c:\PROGRA~2\WI3CF2~1\10\bin\100226~1.0\arm64;c:\Program Files (Arm)\MICROS~1\JDK-16~1.7-H\bin;c:\PROGRA~1\PARALL~1\PARALL~1\APPLIC~1;c:\Windows\System32;c:\Windows;c:\Windows\System32\wbem;c:\Windows\System32\WindowsPowerShell\v1.0;c:\Windows\System32\OpenSSH;c:\PROGRA~1\Git\cmd;c:\PROGRA~3\CHOCOL~1\bin;c:\Users\cocoa\AppData\Local\Programs\Python\PYTHON~1\Scripts;c:\Users\cocoa\AppData\Local\Programs\Python\PYTHON~1;c:\Users\cocoa\AppData\Local\MICROS~1\WINDOW~1;c:\Users\cocoa\AppData\Local\Programs\MICROS~1\bin;c:\PERL_A~1\bin;c:\Users\cocoa\AppData\Local\GITHUB~1\bin;c:\ERLANG~2\bin;c:\Users\cocoa\ELIXIR~1.1\bin" -env ERL_CRASH_DUMP c:/src/release/tests/test_server/emulator_erl_crash.dump -boot start_sasl -sasl errlog_type error -pz "c:/src/release/tests/test_server" -enable-feature maybe_expr -ct_test_vars "{net_dir,\"\"}" -eval "ts_run:ct_run_test(\"c:/src/release/tests/emulator_test\", [{abort_if_missing_suites,true},{suite,bif_SUITE},{testcase,display_string},{logdir,\"../test_server\"},{config,[\"../test_server/ts.config\",\"../test_server/ts.win32.config\"]},{vars,[{verbose,0}]},batch,{scale_timetraps,true}])"  -noinput -eval "erlang:halt(0,[{flush,false}])."

Common Test starting (cwd is c:/src/release/tests/emulator_test)


Common Test: Running make in test directories...
Recompile: bif_SUITE
Recompile: gen_tcp_dist
Recompile: ignore_cores

CWD set to: "c:/src/release/tests/test_server/[email protected]_22.31.26"

Testing tests.emulator_test.bif_SUITE.display_string: Starting test, 1 test cases
"hej"
[debug] we're in display_string_2 win32
[debug] and the BIF_ARG_1 is stderr, fd=96
[debug] WriteFile went wrong
[debug] error msg: The handle is invalid.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
erlang:display_string failed on line 2139
Reason: {badarg,[{erlang,...},{...}|...]}
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Testing tests.emulator_test.bif_SUITE.display_string: *** FAILED test case 1 of 1 ***
Testing tests.emulator_test.bif_SUITE.display_string: TEST COMPLETE, 0 ok, 1 failed of 1 test cases

Updating c:/src/release/tests/test_server/index.html ... done
Updating c:/src/release/tests/test_server/all_runs.html ... done

And I also attempted to use _fileno and _write instead of GetStdHandle and WriteFile but it ended up pretty the same:

#ifdef __WIN32__
    int fd;
    if (ERTS_IS_ATOM_STR("stdout", BIF_ARG_1)) {
        fd = _fileno(stdout);
    } else if (ERTS_IS_ATOM_STR("stderr", BIF_ARG_1)) {
        fd = _fileno(stderr);
    }
#else
// ...
#endif

#ifdef __WIN32__
        written = 0;
        do {
            res = _write(fd, str+written, len-written);
            if (res < 0 && errno != ERRNO_BLOCK && errno != EINTR) {
                printf("[debug] _write went wrong, errno=%d\r\n", errno);
                goto error;
            }
            written += res;
        } while (written < len);
#else
// ...
#endif

The test outputted:

Testing tests.emulator_test.bif_SUITE.display_string: Starting test, 1 test cases
"hej"
[debug] we're in display_string_2 win32
[debug] and the BIF_ARG_1 is stderr, fd=96
[debug] _write went wrong, errno=9

Yet this function seems to be fine when using erl either in PowerShell or WSL:

$ /mnt/c/src/release/aarch64-windows-msvc/bin/erl.exe
Erlang/OTP 27 [RELEASE CANDIDATE 1] [erts-14.2.2] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [jit]

Eshell V14.2.2 (press Ctrl+G to abort, type help(). for help)
1> erlang:display_string("hej").
hejtrue

I wonder if we should skip this test for ARM64 Windows or using some alternatives?

@cocoa-xu
Copy link
Contributor Author

I wonder if we should skip this test for ARM64 Windows or using some alternatives?

And for this particular function, erlang:system_info(system_architecture) only returns win32 on all windows systems. Should we consider to change it to aarch64-pc-windows by changing these lines of the echo_env_wsl function in otp_build?

    if [ X"$X64" = X"true" ]; then
        echo_setenv OVERRIDE_TARGET win32 ';'
        echo_setenv CONFIG_SUBTYPE win64 ';'
    elif [ X"$ARM64" = X"arm64" -o X"$ARM64" = X"x64_arm64" ]; then
        echo_setenv OVERRIDE_TARGET aarch64-pc-windows ';'
        echo_setenv CONFIG_SUBTYPE "$ARM64" ';'
	else
        echo_setenv OVERRIDE_TARGET win32 ';'
    fi

@dgud
Copy link
Contributor

dgud commented Feb 21, 2024

Isn't win32 used as the name of the used API and we still use that on arm for windows?

@cocoa-xu
Copy link
Contributor Author

Isn't win32 used as the name of the used API and we still use that on arm for windows?

Oh it's probably just a minor thing. I was trying to find a way to detect the CPU architecture using

erlang:system_info(system_architecture).

For other systems, the result can usually be used to check the CPU architecture, for example, x86_64-pc-linux-gnu or aarch64-apple-darwin23.2.0. But for Windows we have to do another check using

os:getenv("PROCESSOR_ARCHITECTURE").

Besides that, it also doesn't follow the official document:

system_architecture
  Returns a string containing the processor and OS architecture the emulator is built for.

Maybe it's too late and would definitely break a lot of things for x86 and x86_64 Windows, but perhaps we can do it for ARM64 Windows. WDYT?

@cocoa-xu
Copy link
Contributor Author

I still can't figure out why Windows think the file handle we get from GetStdHandle is invalid, but I found a workaround, 2d79abb, and it fixed all failed test cases due to erlang:display_string/2, including:

emulator

  • bif_SUITE/display_string
  • hello_SUITE/hello

kernel

  • logger_disk_log_h_SUITE/mem_kill_new
  • logger_disk_log_h_SUITE/restart_after

I'll upload a screenshot for the emulator and kernel tests shortly.

@cocoa-xu
Copy link
Contributor Author

cocoa-xu commented Feb 24, 2024

Some explanation for commit 892019a

During my tests, I found that high in erts_time_unit_conversion can overflow and cause the test case time_unit_conversion in time_SUITE to fail. For example, on my ARM64 Windows machine when initialising erts_time_sup__.r.o.start_offset.nsec in erts_init_time_sup,

    erts_time_sup__.r.o.start_offset.nsec = (ErtsMonotonicTime)
	erts_time_unit_conversion((Uint64) abs_native_offset,
				  (Uint32) ERTS_MONOTONIC_TIME_UNIT,
				  (Uint32) 1000*1000*1000);

The value of ERTS_MONOTONIC_TIME_UNIT on my machine is 24000000, and abs_native_offset is 576460752312000000. And high will overflow in erts_time_unit_conversion, as the minimal example shown below:

#include <stdio.h>
#include <stdint.h>

uint64_t
erts_time_unit_conversion(uint64_t value,
	uint32_t from_time_unit,
	uint32_t to_time_unit)
{
	uint64_t high, low, result;
	if (value <= ~((uint64_t) 0)/to_time_unit)
		return (value*to_time_unit)/from_time_unit;

	low = value & ((uint64_t) 0xffffffff);
	high = (value >> 32) & ((uint64_t) 0xffffffff);
	
	low *= to_time_unit;
	high *= to_time_unit;
	
	high += (low >> 32) & ((uint64_t) 0xffffffff);
	low &= ((uint64_t) 0xffffffff);
	
	result = high % from_time_unit;
	high /= from_time_unit;
	printf("[!] high=%llu\n", high);
	printf("[!] high will overflow=%s\n",  (high >> 32) != 0 ? "yes" : "no");
	high <<= 32;
	printf("[!] high=%llu, result=%llu\n", high, result);
	
	result <<= 32;
	result += low;
	result /= from_time_unit;
	result += high;
	
	return result;
}

int main(int argc, char *argv[]) {
	printf("%lld\n\n", erts_time_unit_conversion(576460752312000000, 24000000, 1000*1000*1000));
	printf("%lld\n", erts_time_unit_conversion(576460752312000000, 24000000, 1000*1000));
}

The output of the above minimal code is

[!] high=5592405333
[!] high will overflow=yes
[!] high=5572453937501437952, result=9996874
5572453939290448384

[!] high=5592405
[!] high will overflow=no
[!] high=24019196580986880, result=8001996
24019198013000000

The correct value of the final result should be 24019198012999999488 yet it's too large even for uint64_t to hold it. So practically, there's no easy way to get around with it without changing the type of ErtsMonotonicTime.

And this overflow will cause all subsequent calculations depending on ERTS_MONOTONIC_OFFSET_NSEC to give wrong values. One particular case is that when the user wants to use nano seconds as the time unit in erlang:monotonic_time/1.

As for the fix, I think one possible (and perhaps the easiest and the most striaghtforwad) way is that we can simply do the calculations in erlang code. I tried to detect if there's any overflow in the C code and if so, we handle it in erlang code. But after I've done that, I feel that it just adds some unnecessary complexity to the C code.

I hope all this would make sense, but I'm happy to make any other changes to the C code if there's a better solution for it.

@cocoa-xu
Copy link
Contributor Author

I was wondering if I should try to use __int128_t for ErtsMonotonicTime (if __int128_t is available) or just leave it for now?

@IngelaAndin IngelaAndin removed the team:PS Assigned to OTP team PS label Mar 12, 2024
@garazdawi garazdawi mentioned this pull request May 8, 2024
@cocoa-xu
Copy link
Contributor Author

Hi all! As this is kinda stale, I wonder if it would be easier for us if I break this PR into smaller PRs? I can start with PRs that definitely won't break any existing things.

@dgud
Copy link
Contributor

dgud commented Oct 22, 2024

I'm closing this one, since you have started with smaller PR's.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
team:VM Assigned to OTP team VM
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants